Spectral Methods for Correlated Topic Models
In this paper, we propose guaranteed spectral methods for learning a broad range of topic models that generalize the popular Latent Dirichlet Allocation (LDA). We overcome LDA's inability to incorporate arbitrary topic correlations by assuming that the hidden topic proportions are drawn from a flexible class of Normalized Infinitely Divisible (NID) distributions. NID distributions are generated by normalizing a family of independent Infinitely Divisible (ID) random variables; the Dirichlet distribution is the special case obtained by normalizing a set of Gamma random variables. We prove that this flexible topic-model class can be learned via spectral methods using only moments up to the third order, with (low-order) polynomial sample and computational complexity. The proof is based on a key new technique, derived here, that allows us to diagonalize the moments of the NID distribution through an efficient procedure requiring only univariate integrals, despite the fact that we are handling high-dimensional multivariate moments. To assess the performance of our proposed Latent NID topic model, we use two real datasets of articles collected from the New York Times and PubMed. Our experiments yield improved perplexity on both datasets compared with the baseline.
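The Gamma-normalization construction mentioned in the abstract (the Dirichlet as a normalized set of independent Gamma variables) can be sketched in a few lines. This is an illustrative sample-level sketch using NumPy, not the paper's spectral learning procedure, and the function name is ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def dirichlet_via_gammas(alpha, rng):
    """Sample topic proportions by normalizing independent Gamma draws.

    The Dirichlet is the special case of a Normalized Infinitely
    Divisible (NID) distribution obtained from Gamma variables; other
    ID families yield NID laws that can express topic correlations.
    """
    g = rng.gamma(shape=np.asarray(alpha, dtype=float), scale=1.0)
    return g / g.sum()  # normalize onto the probability simplex

theta = dirichlet_via_gammas([0.5, 0.5, 2.0], rng)
print(theta)  # non-negative proportions summing to 1
```

Replacing the Gamma draws with another infinitely divisible family is what gives the NID class its extra flexibility over the Dirichlet.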
Combining Symbolic Expressions and Black-box Function Evaluations in Neural Programs
Neural programming involves training neural networks to learn programs,
mathematics, or logic from data. Previous works have failed to achieve good
generalization performance, especially on problems and programs with high
complexity or on large domains. This is because they mostly rely either on
black-box function evaluations that do not capture the structure of the
program, or on detailed execution traces that are expensive to obtain, and
hence the training data has poor coverage of the domain under consideration. We
present a novel framework that utilizes black-box function evaluations, in
conjunction with symbolic expressions that define relationships between the
given functions. We employ tree LSTMs to incorporate the structure of the
symbolic expression trees. We use tree encoding for numbers present in function
evaluation data, based on their decimal representation. We present an
evaluation benchmark for this task to demonstrate that our proposed model combines
symbolic reasoning and function evaluation in a fruitful manner, obtaining high
accuracies in our experiments. Our framework generalizes significantly better
to expressions of higher depth and is able to fill partial equations with valid
completions. Comment: Published as a conference paper at the sixth International Conference on Learning Representations (ICLR), 2018.
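The tree encoding of numbers by their decimal representation is described only briefly above; one plausible minimal reading is a right-branching tree over a number's decimal digits, sketched below. The exact encoding used in the paper may differ, and the function name is ours:

```python
def decimal_digit_tree(n):
    """Represent a non-negative integer as a right-branching tree
    (nested tuples) over its decimal digits, so that a tree-structured
    encoder such as a tree LSTM can consume it digit by digit.

    Illustrative only: the paper feeds such structures, together with
    symbolic expression trees, into tree LSTM encoders.
    """
    digits = list(str(n))
    tree = (digits[-1],)          # leaf: least significant digit
    for d in reversed(digits[:-1]):
        tree = (d, tree)          # wrap higher digits around it
    return tree

print(decimal_digit_tree(345))  # ('3', ('4', ('5',)))
```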
Are you going to the party: depends, who else is coming? [Learning hidden group dynamics via conditional latent tree models]
Scalable probabilistic modeling and prediction in high dimensional
multivariate time-series is a challenging problem, particularly for systems
with hidden sources of dependence and/or homogeneity. Examples of such problems
include dynamic social networks with co-evolving nodes and edges and dynamic
student learning in online courses. Here, we address these problems through the
discovery of hierarchical latent groups. We introduce a family of Conditional
Latent Tree Models (CLTM), in which tree-structured latent variables
incorporate the unknown groups. The latent tree itself is conditioned on
observed covariates such as seasonality, historical activity, and node
attributes. We propose a statistically efficient framework for learning both
the hierarchical tree structure and the parameters of the CLTM. We demonstrate
competitive performance in multiple real world datasets from different domains.
These include a dataset on students' attempts at answering questions in a
psychology MOOC, Twitter users participating in an emergency management
discussion and interacting with one another, and windsurfers interacting on a
beach in Southern California. In addition, our modeling framework provides
valuable and interpretable information about the hidden group structures and
their effect on the evolution of the time series.
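The CLTM tree structure is learned with the paper's own statistically efficient framework. As a point of reference only, the classic Chow-Liu algorithm illustrates the general idea of learning a tree over variables: a maximum-weight spanning tree on pairwise mutual information, here over observed (not latent) variables. All names below are ours:

```python
import numpy as np
from itertools import combinations

def mutual_info(x, y):
    """Empirical mutual information between two discrete samples."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            p_ab = np.mean((x == a) & (y == b))
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (np.mean(x == a) * np.mean(y == b)))
    return mi

def chow_liu_tree(data):
    """Maximum-weight spanning tree on pairwise mutual information,
    built with Kruskal's algorithm and union-find.
    data: (n_samples, n_vars) array of discrete observations."""
    d = data.shape[1]
    edges = sorted(
        ((mutual_info(data[:, i], data[:, j]), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True)
    parent = list(range(d))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

# Demo: x1 is a noisy copy of x0; x2 is independent noise.
rng = np.random.default_rng(1)
x0 = rng.integers(0, 2, 500)
x1 = np.where(rng.random(500) < 0.05, 1 - x0, x0)
x2 = rng.integers(0, 2, 500)
tree = chow_liu_tree(np.column_stack([x0, x1, x2]))
print(tree)  # the strong (0, 1) dependence is selected
```

The CLTM goes well beyond this baseline by introducing latent tree nodes and conditioning the tree on observed covariates.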
Funding: 7R01HL141813-06 (NIH/National Heart, Lung, and Blood Institute); Optum Labs, Inc.; NIH/National Institutes of Health. Accepted manuscript.
Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat
ML model design either starts with an interpretable model or a Blackbox and
explains it post hoc. Blackbox models are flexible but difficult to explain,
while interpretable models are inherently explainable. Yet, interpretable
models require extensive ML knowledge and tend to be less flexible and to
underperform their Blackbox variants. This paper aims to blur the
distinction between a post hoc explanation of a Blackbox and constructing
interpretable models. Beginning with a Blackbox, we iteratively carve out a
mixture of interpretable experts (MoIE) and a residual network. Each
interpretable model specializes in a subset of samples and explains them using
First Order Logic (FOL), providing basic reasoning on concepts from the
Blackbox. We route the remaining samples through a flexible residual. We repeat
the method on the residual network until all the interpretable models explain
the desired proportion of data. Our extensive experiments show that our route,
interpret, and repeat approach (1) identifies a diverse set of
instance-specific concepts with high concept completeness via MoIE without
compromising performance, (2) identifies the relatively "harder" samples
to explain via residuals, (3) outperforms the interpretable by-design models by
significant margins during test-time interventions, and (4) fixes the shortcut
learned by the original Blackbox. The code for MoIE is publicly available at:
https://github.com/batmanlab/ICML-2023-Route-interpret-repeat. Comment: Published at ICML, 2023.
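The route-interpret-repeat control flow can be sketched abstractly. The toy below substitutes a single-feature threshold rule for the paper's FOL-based interpretable experts and routes samples by correctness rather than a learned selector, so it mirrors only the loop structure, not MoIE itself; all names are ours:

```python
import numpy as np

def route_interpret_repeat(X, y, coverage=0.9, max_experts=5):
    """Toy sketch of the route-interpret-repeat loop: at each step, fit
    the best single-feature threshold rule (a stand-in for an
    interpretable expert), let it claim the samples it classifies
    correctly, and route the rest to the residual. Repeat until the
    experts cover the desired proportion of the data."""
    remaining = np.arange(len(y))
    experts = []
    while remaining.size and len(experts) < max_experts:
        best = None
        for f in range(X.shape[1]):
            for t in np.unique(X[remaining, f]):
                pred = (X[remaining, f] >= t).astype(int)
                for label in (pred, 1 - pred):
                    correct = remaining[label == y[remaining]]
                    if best is None or correct.size > best[0].size:
                        best = (correct, f, t)
        correct, f, t = best
        experts.append((f, t, correct.size))          # one expert per round
        remaining = np.setdiff1d(remaining, correct)  # residual samples
        if 1 - remaining.size / len(y) >= coverage:
            break
    return experts, remaining

# Demo: a single threshold on feature 0 explains all four samples.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
experts, residual = route_interpret_repeat(X, y)
print(experts, residual)
```

In the paper, each round instead distills concept-based FOL explanations from the Blackbox and trains a flexible residual network on the routed remainder.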
Learning Latent Hierarchical Structures via Probabilistic Models and Deep Learning
Hierarchical structures arise in many real-world applications and domains. For example, in social networks, people's relationships and the groups to which they belong form a hierarchy. In natural language and computer programs, parse trees (which have a hierarchical structure) are used to represent the compositionality of expressions. These hierarchies strongly affect the statistics and the behavior of the data, so it is important to develop algorithms that take these structures into account when modeling such data. Apart from these hierarchical structures, some datasets are best explained with hierarchical models even though there is no apparent hierarchy in the data itself. For instance, when modeling the occurrence of words in a document, it is more realistic to assume that the words are drawn in a hierarchical manner from a topic distribution rather than independently from a single topic. In this dissertation, we focus on capturing these hierarchies and leveraging them for modeling high-dimensional datasets.

Hierarchical structures underlying the data are either observed or latent. For example, in the context of computer programs, the syntax tree is inherent to the program and is therefore observed. On the other hand, the statistical dependence structure of a social network's users is latent. In this dissertation, we study both types of hierarchies and develop models under both structures, because both arise in many applications and are equally important. Nevertheless, capturing latent hierarchical structures is more challenging. We develop novel probabilistic models to capture latent hierarchies and present statistically efficient and provably consistent parameter-learning algorithms for them. When capturing observed hierarchical structures, we develop deep learning models that learn low-dimensional continuous representations for the discrete symbols and variables.
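The contrast drawn above, words drawn hierarchically through per-document topic proportions versus independently from a single topic, can be made concrete with a small generative sketch. The topic matrix and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two topics over a 4-word vocabulary (each row sums to 1).
topics = np.array([[0.4, 0.4, 0.1, 0.1],
                   [0.1, 0.1, 0.4, 0.4]])

def single_topic_doc(n_words, topic, rng):
    """Flat model: every word drawn independently from one fixed topic."""
    return rng.choice(4, size=n_words, p=topics[topic])

def hierarchical_doc(n_words, alpha, rng):
    """Hierarchical model: first draw per-document topic proportions,
    then a topic for each word, then the word itself -- the LDA-style
    process the abstract contrasts with the single-topic assumption."""
    theta = rng.dirichlet(alpha)                        # document level
    z = rng.choice(len(theta), size=n_words, p=theta)   # per-word topics
    return np.array([rng.choice(4, p=topics[zi]) for zi in z])

doc = hierarchical_doc(50, [1.0, 1.0], rng)
print(doc[:10])  # word indices mixing both topics within one document
```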